-
-
Notifications
You must be signed in to change notification settings - Fork 157
Drop ANN exceptions from *arrays* and core/base.py #1568
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Drop ANN exceptions from *arrays* and core/base.py #1568
Conversation
Co-authored-by: Yi-Fan Wang <cmp0xff@users.noreply.github.com>
cmp0xff
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just reviewed a small portion of all changes.
| storage_options: StorageOptions = ..., | ||
| engine_kwargs: dict[str, Any] | None = ..., | ||
| ) -> None: ... | ||
| def __fspath__(self): ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is functional: https://docs.python.org/3/library/os.html#os.PathLike.__fspath__
should return a string, according to the implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since not documented, the question is do we need it, happy to revert and mark the return type as str, let me know your view
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it will be useful if ExcelFile is to be taken as a Path-like object. Would prefer to keep.
| skipna: bool = ..., | ||
| ): ... | ||
| def median( | ||
| def __init__( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think __new__ is preferred over __init__. It is easier to understand the return types in __new__ than the typing hint of self in __init__.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above
| freq: Frequency | None = None, | ||
| copy: bool = False, | ||
| ) -> None: ... | ||
| __rmul__ = ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can also drop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay will drop, I wanted to keep it concise but happy to do more.
pandas-stubs/core/arrays/period.pyi
Outdated
| class PeriodArray(DatetimeLikeArrayMixin, DatelikeOps): | ||
| __array_priority__: int = ... | ||
| def __init__(self, values, freq=..., dtype=..., copy: bool = ...) -> None: ... | ||
| def __init__( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
__new__ instead of __init__?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed
pandas-stubs/core/arrays/period.pyi
Outdated
| def __init__(self, values, freq=..., dtype=..., copy: bool = ...) -> None: ... | ||
| def __init__( | ||
| self, | ||
| values: Series[PeriodDtype] | npt.NDArray[np.integer] | PeriodIndex, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| values: Series[PeriodDtype] | npt.NDArray[np.integer] | PeriodIndex, | |
| values: np_ndarray_anyint | PeriodArray | PeriodIndex | Series[Period], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
| Ellipsis = "..." | ||
|
|
||
| class SparseArray(ExtensionArray, ExtensionOpsMixin): | ||
| def __init__( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
__new__ instead of __init__?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure it makes a difference here since we have only one value.
| sparse_index: SparseIndex | None = None, | ||
| fill_value: Scalar | None = None, | ||
| kind: str = "integer", | ||
| dtype: np.dtype | SparseDtype | None = ..., |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to distinguish >= 3.11 (np.dtype) and 3.10 (np.dtype[Any])
| @property | ||
| def sp_index(self): ... | ||
| @property | ||
| def sp_values(self): ... | ||
| @property | ||
| def dtype(self): ... | ||
| @property | ||
| def fill_value(self): ... | ||
| @fill_value.setter | ||
| def fill_value(self, value) -> None: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think removing them might not be a good idea. They seem to be the core use case of SpareArray. We can raise issues in Pandas for the proper documentation.
pandas-stubs/core/arrays/integer.pyi
Outdated
| if sys.version_info >= (3, 11): | ||
| def __init__( | ||
| self, values: AnyArrayLike, mask: np_1darray, copy: bool = ... | ||
| ) -> None: ... | ||
| else: | ||
| def __init__( | ||
| self, values: AnyArrayLike, mask: np.ndarray, copy: bool = ... | ||
| ) -> None: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self, values: AnyArrayLike, mask: np_ndarray, copy: bool = ... would do in all cases
| def month_name(self, locale=...): ... | ||
| def day_name(self, locale=...): ... | ||
| ) -> Self: ... | ||
| def to_pydatetime(self) -> npt.NDArray[np.object_]: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe np_1darray_object, also in other cases here
| is_year_end = ... | ||
| is_leap_year = ... | ||
| def to_julian_date(self): ... | ||
| def to_julian_date(self) -> npt.NDArray[np.float64]: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe np_1darray_float
|
should the title be updated to mention |
| def __new__( | ||
| cls, | ||
| values: np_ndarray_anyint | PeriodArray | PeriodIndex | Series[Period], | ||
| freq: Frequency | None = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be PeriodFrequency?
MarcoGorelli
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for working on this!
some methods like isna are inherited from ExtensionArray, but could you elaborate on why the others are being removed? is it that they're not documented?
| def any(self, *, skipna: bool = ..., **kwargs: Any): ... | ||
| def all(self, *, skipna: bool = ..., **kwargs: Any): ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry if i missed it but what happened to these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any and all are undocumented, but personally I would prefer to keep them as well.
| def isna(self): ... | ||
| def shift(self, periods: int = 1, fill_value=...): ... | ||
| def unique(self): ... | ||
| def value_counts(self, dropna: bool = True): ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why remove value_counts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, maybe keeping it would be better
| self, key: SequenceIndexer | tuple[int | ellipsis, ...] | ||
| ) -> Self: ... | ||
| def copy(self): ... | ||
| def map(self, mapper): ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why remove this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, maybe keeping it would be better
| def isna(self): ... | ||
| def shift(self, periods: int = 1, fill_value=...): ... | ||
| def unique(self): ... | ||
| def value_counts(self, dropna: bool = True): ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, maybe keeping it would be better
| self, key: SequenceIndexer | tuple[int | ellipsis, ...] | ||
| ) -> Self: ... | ||
| def copy(self): ... | ||
| def map(self, mapper): ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, maybe keeping it would be better
| def any(self, *, skipna: bool = ..., **kwargs: Any): ... | ||
| def all(self, *, skipna: bool = ..., **kwargs: Any): ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any and all are undocumented, but personally I would prefer to keep them as well.
| if sys.version_info >= (3, 11): | ||
| def argsort( | ||
| self, *, ascending: bool = ..., kind: str = ..., **kwargs: Any | ||
| ) -> npt.NDArray[np.intp]: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably just use one case and return
| ) -> npt.NDArray[np.intp]: ... | |
| ) -> np_ndarray_intp: ... |
because the returned values seem to be indices, hence intp
| def __eq__(self, other) -> bool: ... | ||
| def __ne__(self, other) -> bool: ... | ||
| def __lt__(self, other) -> bool: ... | ||
| def __gt__(self, other) -> bool: ... | ||
| def __le__(self, other) -> bool: ... | ||
| def __ge__(self, other) -> bool: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Categoricals are indeed comparable, so we need to keep
| def min(self, *, axis=..., skipna: bool = ..., **kwargs: Any): ... | ||
| def max(self, *, axis=..., skipna: bool = ..., **kwargs: Any): ... | ||
| def mean(self, *, skipna: bool = ...): ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe keeping them is better
| def __init__(self, values, dtype=..., freq=..., copy: bool = ...) -> None: ... | ||
| def __init__( | ||
| self, | ||
| values: Series | Index | DatetimeArray | npt.NDArray[np.object_], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe
| values: Series | Index | DatetimeArray | npt.NDArray[np.object_], | |
| values: Series | Index | DatetimeArray | np_ndarray_str, |
| def day_name(self, locale: str | None = None) -> Series | Index: ... | ||
| @property | ||
| def time(self): ... | ||
| def time(self) -> npt.NDArray[np.object_]: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| def time(self) -> npt.NDArray[np.object_]: ... | |
| def time(self) -> np_1darray_object: ... |
| def time(self) -> npt.NDArray[np.object_]: ... | ||
| @property | ||
| def timetz(self): ... | ||
| def timetz(self) -> npt.NDArray[np.object_]: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| def timetz(self) -> npt.NDArray[np.object_]: ... | |
| def timetz(self) -> np_1darray_object: ... |
| def timetz(self) -> npt.NDArray[np.object_]: ... | ||
| @property | ||
| def date(self): ... | ||
| def date(self) -> npt.NDArray[np.object_]: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| def date(self) -> npt.NDArray[np.object_]: ... | |
| def date(self) -> np_1darray_object: ... |
|
I may have put a little too much in this PR, will split so we can be more focused and not spread ourselves. |
assert_type()to assert the type of any return value)AGENTS.md.Quite a bit of clean up to comply with ANN rules.